ABSTRACT

After dividing voice signals into subframes, the
voice coder of the present invention calculates auditory
sense masking threshold values for each subframe with a
masking threshold value calculating circuit, and
transforms the auditory sense masking threshold values
to auditory sense weighting filter coefficients. An
auditory sense weighting circuit performs auditory sense
weighting on the signals using the auditory sense
weighting filter coefficients, and excitation
codebooks or multi-pulses are searched using the
auditory sense weighted signals.

VOICE CODER AND A METHOD FOR SEARCHING CODEBOOKS

BACKGROUND OF THE INVENTION 

The present invention relates to voice coding techniques
for encoding voice signals with high quality at low bit
rates, especially at 8 to 4.8 kb/s.

As a method for coding voice signals at low bit rates
of about 8 to 4.8 kb/s, there is, for example, the CELP (Code
Excited LPC Coding) method described in the paper titled
"Code-excited linear prediction: High quality speech at
very low bit rates" (Proc. ICASSP, pp. 937-940, 1985) by M.
Schroeder and B. Atal (reference No. 1) and the paper titled
"Improved speech quality and efficient vector quantization
in SELP" (ICASSP, pp. 155-158, 1988) by Kleijn et al.
(reference No. 2).

In the method described in these papers, spectral
parameters representing spectral characteristics of voice
signals are extracted on the transmission side from the voice
signals for each frame (20 ms, for example). Then, the
frames are divided into subframes (5 ms, for example), and
pitch parameters of an adaptive codebook representing long-term
correlation (pitch correlation) are extracted for each
subframe so as to minimize a weighted squared error between
a signal regenerated based on a past excitation signal and
the voice signal. Next, the subframe's voice
signals are predicted in the long term based on these pitch
parameters. Based on the residual signals calculated
through this long-term prediction, one kind of noise
signal is selected from a codebook consisting of pre-set
kinds of noise signals so as to minimize the weighted squared
error between a signal synthesized from the selected
signal and the voice signal, and an optimal gain is
calculated. Then, an index representing the type of the
selected noise signal, the gain, the spectral parameters and
the pitch parameters are transmitted.

In addition, as another method for coding voice
signals at low bit rates of about 8 to 4.8 kb/s, the
multi-pulse coding method described in the paper titled
"A new model of LPC excitation for producing natural-sounding
speech at low bit rates" (Proc. ICASSP, pp. 614-617,
1982) by B. Atal et al. (reference No. 3) and so on is
known.

In the method of reference No. 3, the residual
signal of the above-mentioned method is represented by a
multi-pulse consisting of a pre-set number of pulses
whose amplitudes and locations differ
from one another, and the amplitude and location of each
pulse are calculated. Then, the amplitudes and locations of the
multi-pulse, the spectral parameters and the pitch
parameters are transmitted.

In the prior art described in references No. 1, No. 2
and No. 3, a weighted
squared error between a supplied voice signal and a
signal regenerated from the codebook or the multi-pulse
is used as the error evaluation criterion when searching
the multi-pulses, the adaptive codebook, and the codebook
consisting of noise signals.

The following equation shows such a weighting
criterion.

$$W(z) = \frac{1 - \sum_{i=1}^{L} a_i \gamma_1^{i} z^{-i}}{1 - \sum_{i=1}^{L} a_i \gamma_2^{i} z^{-i}} \tag{1}$$

Where W(z) represents the transfer characteristics of
a weighting filter, $a_i$ is a linear prediction
coefficient calculated from a spectral parameter, and
$\gamma_1$ and $\gamma_2$ are constants for controlling the weighting quantity;
they are usually set so that $0 < \gamma_2 < \gamma_1 \le 1$.

However, there is a problem that the speech quality of
voices regenerated using code vectors selected with this
criterion, or using multi-pulses calculated with it, does not
always match natural auditory feeling, because this evaluation
criterion itself does not match natural auditory feeling.

Moreover, this problem becomes particularly
noticeable when the bit rate is reduced and the codebook
is reduced in size.

Furthermore, in the above-mentioned prior art, the
number of bits of the codebook in each subframe is assumed
to be constant when searching a codebook consisting of noise
signals. Additionally, the number of multipulses in a
frame or a subframe is also constant when calculating a
multipulse.

However, the power of voice signals varies remarkably
over time, so it has been difficult to code voices
with high quality by a method using a constant number of
bits.
This problem becomes especially serious under
conditions in which bit rates are reduced and sizes of
codebooks are minimized.

SUMMARY OF THE INVENTION 

It is an object of the present invention to solve
the above-mentioned problems.

Another object of the present invention is to
provide a voice coding technique matching auditory feeling.

Moreover, another object of the present invention
is to provide a voice coding technique that enables lower bit
rates than the prior art.

The above-mentioned objects of the present
invention are achieved by a voice coder comprising a
masking calculating means for calculating masking
threshold values from supplied discrete voice signals
based on auditory sense masking characteristics,
auditory sense weighting means for calculating filter
coefficients based on the masking threshold values and
weighting input signals based on the filter
coefficients, a plurality of codebooks, each of them
consisting of a plurality of code vectors, and a
searching means for searching the codebooks for a code vector that
minimizes output signal power of the auditory sense
weighting means.

The voice coder of the present invention performs,
for each of the subframes created by dividing frames,
auditory sense weighting calculated based on auditory
sense masking characteristics on the signals supplied to
the adaptive codebooks, excitation codebooks or multi-pulse
calculation when searching adaptive codebooks and excitation
codebooks or calculating multi-pulses.

In auditory sense weighting, masking threshold
values are calculated based on auditory sense masking
characteristics, and an error scale is calculated by
performing auditory sense weighting on supplied signals
based on the masking threshold values. Then, an optimal
code vector is calculated from the codebooks so as to
minimize the error scale, namely, a code vector that
minimizes the weighted error power shown in the following
equation.

$$E = \sum_{n=0}^{N-1} \left[ \left( x(n) - \gamma_j c_j(n) * h(n) \right) * w_m(n) \right]^2 \tag{2}$$

In accordance with the present invention there is
provided a voice coder comprising: masking calculating means 
for calculating masking threshold values from supplied 
discrete voice signals based on auditory sense masking 
characteristics; auditory sense weighting means for 
calculating filter coefficients based on said masking 
threshold values and weighting input signals based on said 
filter coefficients; a codebook which includes a plurality of
code vectors; and a searching means for searching for a code
vector in the codebook that minimizes error signal power
between an output signal of said auditory sense weighting 
means and the code vectors in said codebook. 

In accordance with the present invention there is 
further provided a voice coder comprising: dividing means for 
dividing supplied discrete voice signals into first pre-set 
time length frames; subframe generating means for generating 
subframes by dividing said frames into second pre-set time 
length divisions; regenerating means for regenerating said 
voice signals for said subframes based on an adaptive
codebook; masking calculating means for calculating masking
threshold values for each of said subframes from said voice
signals based on auditory sense masking characteristics; an 
auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
performing auditory sense weighting to a difference signal 
formed as a difference between a signal regenerated with said 
regenerating means and said voice signal based on said filter 
coefficients; an excitation codebook which includes a
plurality of code vectors; and searching means for searching
for a code vector in said excitation codebook that minimizes 
an error signal power between said auditory sense weighting 
means and the code vectors in said excitation codebook. 

In accordance with the present invention there is 
further provided a voice coder comprising: dividing means for
dividing supplied discrete voice signals into pre-set time 
length frames; subframe generating means for generating 
subframes by dividing said frames into pre-set time length 
divisions; masking calculating means for calculating masking 
threshold values for each of said subframes from said voice 
signals based on auditory sense masking characteristics; 
auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
performing auditory sense weighting to said voice signals 
based on said filter coefficients; adaptive codebook means for 
calculating an adaptive code vector that minimizes power of a 
difference signal formed as a difference between a response 
signal and a voice signal weighted with said auditory sense 
weighting means; an excitation codebook which includes a
plurality of excitation code vectors; and searching means for 
searching for a code vector in said excitation codebook that 
minimizes an error signal power between an output signal 
generated from said adaptive codebook means and said 
difference signal.

In accordance with the present invention there is 
further provided a voice coder comprising: dividing means for
dividing supplied discrete voice signals into pre-set time
length frames; subframe generating means for generating
subframes by dividing said frames into pre-set time length
divisions; regenerating means for regenerating said voice
signals for each of said subframes based on an adaptive 
codebook; masking calculating means for calculating masking 
threshold values from said voice signals based on auditory 
sense masking characteristics; auditory sense weighting means 
for calculating filter coefficients based on said masking 
threshold values and performing auditory sense weighting to an
error signal formed as a difference between said voice signal
and a signal regenerated with said regenerating means based on 
said filter coefficients; and calculating means for 
calculating a multi-pulse that minimizes an error signal power 
between an output signal of said auditory sense weighting 
means and said code vectors in said adaptive codebook. 

In accordance with the present invention there is 
further provided a method for searching a codebook used for 
coding discrete voice signals, using signals weighted with 
masking threshold values calculated from said voice signals
based on auditory sense masking characteristics, the method
comprising the steps of: (a) dividing said voice signals 
into pre-set time length frames; (b) generating subframes by 
dividing said frames into pre-set time length divisions; 
(c) regenerating said voice signals for each of said 
subframes based on an adaptive codebook; (d) calculating 
masking threshold values from said voice signals based on 
auditory sense masking characteristics; (e) calculating 
filter coefficients based on said masking threshold values and
performing auditory sense weighting to an error signal between
a signal regenerated in the step (c) and said voice signal, 
based on said filter coefficients; and (f) searching for an 
excitation code vector in an excitation codebook that 
minimizes the error signal power weighted in the step (e). 

In accordance with the present invention there is
further provided a method for searching a codebook used
for coding discrete voice signals, using signals weighted with
masking threshold values calculated from said voice signals
based on auditory sense masking characteristics, the method
comprising the steps of: (1) dividing said voice signals
into pre-set time length frames; (2) generating subframes by
dividing said frames into pre-set time length divisions;
(3) calculating masking threshold values from said voice
signals based on auditory sense masking characteristics;
(4) calculating filter coefficients based on said masking
threshold value and performing auditory sense weighting to
said voice signal based on said filter coefficients;
(5) calculating, for each of said subframes and using a
difference signal formed as a difference between a response
signal and a voice signal weighted in the step (4), an
adaptive code vector that minimizes a power of said difference
signal, and regenerating said voice signal; and
(6) searching for an excitation code vector in an excitation
codebook that minimizes an error signal power between a signal
regenerated in the step (5), and said voice signal.

In accordance with the present invention there is 
further provided a voice coder comprising: dividing means for
dividing supplied discrete voice signals into frames of a
first pre-set time length and further dividing said frames 
into subframes of a second pre-set time length smaller than 
said first pre-set time length; masking calculating means for 
calculating masking threshold values from said voice signals 
based on auditory sense masking characteristics; a plurality 
of codebooks of which bit numbers are different from each 
other; bit number allocating means for allocating a number of 
bits of said codebooks based on said masking threshold values;
and searching means for searching a code vector by switching
said codebooks for each of said subframes based on the 
allocated number of bits. 

In accordance with the present invention there is 
further provided a voice coder comprising: dividing means for 
dividing supplied discrete voice signals into frames of a pre- 
set time length; masking calculating means for calculating 
masking threshold values from said voice signals based on 
auditory sense masking characteristics; pitch calculating 
means for calculating pitch parameters so as to make signals
regenerated based on said adaptive codebooks made of past
excitation signals approximate, for each of said subframes,
said voice signals; auditory sense weighting means for 
calculating filter coefficients based on said masking 
threshold values and conducting auditory sense weighting to 
error signals between signals regenerated with said pitch 
calculating means and said voice signals based on said filter 
coefficients; a plurality of excitation codebooks of which bit 
numbers are different from each other; bit allocating means
for allocating a bit number of said excitation codebooks for
each of said subframes based on said masking threshold values; 
and searching means for switching said excitation codebooks 
for each of said subframes based on the allocated number of 
bits and searching for an excitation code vector minimizing an 
error signal power between an output signal generated from 
said auditory sense weighting means and code vectors in a
switched excitation codebook. 

In accordance with the present invention there is 
further provided a voice coder comprising: dividing means for 
dividing supplied discrete voice signals into frames of a 
first pre-set time length and further dividing said frames 
into subframes of a second pre-set time length smaller than 
said first pre-set time length; masking calculating means for 
calculating masking threshold values from said voice signals 
based on auditory sense masking characteristics; deciding 
means for deciding a number of multipulses for each of said 
subframes based on said masking threshold values; and means 
for representing excitation signals of said voice signals in a 
form of multipulse using the number of multipulses decided for 
each of said subframes. 

In accordance with the present invention there is 
further provided a voice coder comprising: dividing means for 
dividing supplied discrete voice signals into frames of a 
first pre-set time length; means for generating subframes by 
dividing said frames into divisions of a second pre-set time 
length; masking calculating means for calculating masking 
threshold values from said voice signals based on auditory
sense masking characteristics; pitch calculating means for
calculating pitch parameters so as to make signals regenerated 
based on said adaptive codebooks made of past excitation 
signals approximate, for each of said subframes, said voice 
signals; auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
conducting auditory sense weighting to error signals between 
signals regenerated with said pitch calculating means and said 
voice signals based on said filter coefficients; deciding 
means for deciding a number of multipulses for each of said 
subframes based on said masking threshold values; and 
means for calculating a multipulse minimizing said error 
signal power using the number of multipulses decided for each 
of said subframes and representing excitation signals of said 
voice signals using said multipulse. 

In accordance with the present invention there is 
further provided the method of searching codebooks of claim 
42, wherein said codebooks are excitation codebooks. 

In accordance with the present invention there is 
further provided a multipulse calculating method comprising 
the steps of: (a) dividing and subbanding supplied discrete 
voice signals into frames of a first pre-set time length and 
further dividing said frames into subframes of a second pre- 
set time length; (b) calculating masking threshold values 
from said voice signals based on auditory sense masking 
characteristics, and dividing supplied discrete voice signals 
into frames of the first pre-set time length and further 
dividing said frames into subframes of the second pre-set time
length; (c) deciding a number of multipulses for each of said
subframes based on said masking threshold values; and 
(d) calculating a multipulse minimizing said error signal 
power using a number of multipulses decided for each of said 
subframes and representing excitation signals of said voice 
signals using said multipulse.

These and other objects, features and advantages of
the present invention will become more apparent upon
reading the following detailed description and
drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a block diagram showing the first
embodiment of the present invention.

Fig. 2 is a block diagram showing the second
embodiment of the present invention.

Fig. 3 is a block diagram showing the third
embodiment of the present invention.

Fig. 4 is a block diagram showing the fourth
embodiment of the present invention.

Fig. 5 is a block diagram showing the fifth
embodiment of the present invention.

Fig. 6 is a block diagram showing the sixth
embodiment.

Fig. 7 is a block diagram showing the seventh
embodiment.

Fig. 8 is a block diagram showing the seventh
embodiment.

Fig. 9 is a block diagram showing the eighth
embodiment.

Fig. 10 is a block diagram showing the ninth
embodiment.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

First, the first embodiment of the present
invention is explained.

In this first embodiment, an error signal output
from an auditory sense weighting filter based on masking
threshold values is used for searching an excitation
codebook.

Fig. 1 is a block diagram of a voice coder according to the
present invention.

On the transmission side of Fig. 1, voice signals
are input from an input terminal 100, and voice signals of
one frame (20 ms, for example) are stored in a buffer
memory 110. An LPC analyzer 130 performs well-known LPC
analysis on the one-frame voice signal, and calculates LSP
parameters representing spectral characteristics of
the voice signals for a pre-set number of orders.

Next, an LSP quantization circuit 140 outputs a
code $l_k$, obtained by quantizing the LSP parameters with a
pre-set number of quantization bits, to a multiplexer 260.
Then, it decodes the code $l_k$, transforms it into linear
prediction coefficients $a_i$ (i = 1 to L), and outputs the
result to an impulse response calculator 170 and a
synthesis filter 281.

It is to be noted that, for LSP parameter coding and for
the method of transforming between LSP parameters and
linear prediction coefficients, it is possible to refer to the paper
titled "Quantizer design in LSP speech analysis-synthesis"
(IEEE J. Sel. Areas Commun., pp. 432-440,
1988) by Sugamura et al. (reference No. 4) and so on.
Also, it is possible to use vector-scalar
quantization or other well-known vector quantizing
methods for more efficiently quantizing LSP parameters.
On vector-scalar quantization of LSP, it is possible
to refer to the paper titled "Transform Coding of Speech
using a Weighted Vector Quantizer" (IEEE J. Sel. Areas
Commun., pp. 425-431, 1988) by Moriya et al. (reference
No. 5) and so on.

A subframe dividing circuit 150 divides the one-frame
voice signal into subframes. As an example, the subframe
length is assumed to be 5 ms.

A subtracter 190 subtracts an output wave $\hat{x}(n)$ of
the synthesis filter 281 from the voice signal x(n), and
outputs a signal $x'(n)$.

The adaptive codebook 210 receives an input signal
v(n) of the synthesis filter 281 through a delay circuit
206, and receives a weighted impulse response h(n) from the
impulse response calculator 170 and the signal $x'(n)$
from the subtracter 190. Then, it performs long-term
correlation pitch prediction based on these signals and
calculates a delay M and a gain $\beta$ as pitch parameters.
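
The delay and gain search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the circuit of Fig. 1: the function names, the delay range, and the restriction that the delay is at least one subframe long (so that v(n-M) lies entirely in the past excitation) are hypothetical choices, and the closed-form gain is the ordinary least-squares solution rather than something the text specifies.

```python
def convolve_truncated(signal, h):
    """Convolve signal with impulse response h, keeping len(signal) samples."""
    return [
        sum(h[k] * signal[n - k] for k in range(min(n, len(h) - 1) + 1))
        for n in range(len(signal))
    ]


def search_adaptive_codebook(target, past_excitation, h, min_delay, max_delay):
    """Return (M, beta) minimizing sum((target - beta * y_M)^2).

    y_M is v(n - M) * h(n). Assumes min_delay >= len(target), a simplifying
    restriction introduced for this sketch only.
    """
    n_sub = len(target)
    start = len(past_excitation)
    best = (None, 0.0, -1.0)  # (delay, gain, score)
    for delay in range(min_delay, max_delay + 1):
        segment = past_excitation[start - delay:start - delay + n_sub]
        y = convolve_truncated(segment, h)
        energy = sum(s * s for s in y)
        if energy == 0.0:
            continue
        corr = sum(t * s for t, s in zip(target, y))
        score = corr * corr / energy  # error reduction achieved by the best gain
        if score > best[2]:
            best = (delay, corr / energy, score)
    return best[0], best[1]
```

Maximizing the normalized correlation is equivalent to minimizing the squared error once the gain is set to its least-squares value, which is why the loop never evaluates the error directly.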
Here, the adaptive codebook prediction order is
assumed to be 1. However, the order can be 2 or more.
Moreover, references No. 1 and No. 2 and so on can
be referred to for the calculation of the delay M in the
adaptive codebook.

Next, using the calculated gain $\beta$, an adaptive
code vector $\beta \cdot v(n-M) * h(n)$ is calculated. Then, the
subtracter 195 subtracts the adaptive code vector from
the signal $x'(n)$ and outputs a signal $x_z(n)$.

$$x_z(n) = x'(n) - \beta \cdot v(n-M) * h(n) \tag{3}$$

Where $x_z(n)$ is an error signal, $x'(n)$ is an output
signal of the subtracter 190, v(n) is a past synthesis
filter driving signal, and h(n) is an impulse response of
the synthesis filter calculated from linear prediction
coefficients.

A masking threshold value calculator 205 calculates
a spectrum X(k) (k = 0 to N-1) by FFT transforming the
voice signal x(n) at N points, next calculates a power
spectrum $|X(k)|^2$, and calculates power or RMS for each
critical band by analyzing the result using a critical
band filter or an auditory sense model. The following
equation is used for power calculation.

$$\sum_{k=bl_i}^{bh_i} |X(k)|^2 \quad (i = 1, \ldots, R) \tag{4}$$

Where $bl_i$ and $bh_i$ respectively show the lower limit
frequency and the upper limit frequency of the i-th critical
band. R shows the number of critical bands included in the
voice signal band.
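
The per-band power calculation can be illustrated with a short Python sketch. It is only a sketch under assumptions: a direct DFT stands in for the FFT, the band limits are passed in as hypothetical pairs of bin indices (bl_i, bh_i), and real critical band edges would come from a psychoacoustic table that the text does not reproduce.

```python
import cmath
import math


def power_spectrum(x):
    """Return |X(k)|^2 for k = 0..N-1 using a direct (slow) DFT."""
    n = len(x)
    spectrum = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def critical_band_powers(x, band_edges):
    """Sum |X(k)|^2 over bins bl_i..bh_i (inclusive) for each band."""
    p = power_spectrum(x)
    return [sum(p[bl:bh + 1]) for bl, bh in band_edges]
```

For a unit-amplitude cosine centered on bin 3 of a 16-point frame, all of the power counted over bins 0 to 8 falls into whichever band contains bin 3.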

Next, a masking threshold value C(i) in each
critical band is calculated using the values of
equation (4), and output.

Here, as a method of calculating masking threshold
values, for example, a method using values obtained
through psychoacoustic experiments is
known. For details, it is possible to refer to the paper
titled "Transform coding of audio signals using
perceptual noise criteria" (IEEE J. Sel. Areas
Commun., pp. 314-323, 1988) by Johnston et al. (reference
No. 6) or the paper titled "Vector quantization and
perceptual criteria in SVD based CELP coders" (ICASSP,
pp. 33-36, 1990) by R. Drogo de Iacovo et al. (reference
No. 7).

Moreover, for critical band filters or critical
band analysis, for example, it is possible to refer to
the fifth chapter (reference No. 8) of the book titled
"Foundation of modern auditory theory" by J.
Tobias and so on. In addition, for auditory models, for example,
it is possible to refer to the paper titled "A
computational model for the peripheral auditory system:
Application to speech recognition research" (Proc.
ICASSP, pp. 1983-1986, 1986) by Seneff (reference No. 9)
and so on.

Next, each masking threshold value C(i) is
transformed to power to obtain a power spectrum, and an
auto-correlation function r(j) (j = 0, ..., N-1) is calculated
through an inverse FFT operation.

Then, filter coefficients $b_i$ (i = 1, ..., P) are
calculated by applying well-known linear prediction
analysis to P+1 auto-correlation values.

The auditory sense weighting circuit 220 applies
weighting, according to the following equation, to the
error signal $x_z(n)$ obtained by equation (3) in the
adaptive codebook 210, using the filter coefficients $b_i$,
and a weighted signal $x_{zm}(n)$ is obtained.

$$x_{zm}(n) = x_z(n) * W_m(n) \tag{5}$$

Where $W_m(n)$ is an impulse response of an auditory
sense weighting filter consisting of the filter
coefficients $b_i$.

Here, for the auditory sense weighting filter, a
filter having a transfer function represented by the
following equation (6) can be used.

$$H(z) = \frac{1 - \sum_{i=1}^{P} b_i r_1^{i} z^{-i}}{1 - \sum_{i=1}^{P} b_i r_2^{i} z^{-i}} \tag{6}$$

Where $r_2$ and $r_1$ are constants meeting $0 \le r_2 < r_1 \le 1$.
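
As a rough illustration of how such a weighting filter could be applied to a signal, the following Python sketch runs the difference equation of a pole-zero filter. It assumes the transfer function has the form H(z) = (1 - Σ b_i r_1^i z^-i) / (1 - Σ b_i r_2^i z^-i) with zero initial state; the function name and argument order are hypothetical.

```python
def weighting_filter(signal, b, r1, r2):
    """Filter signal with H(z) = (1 - sum b_i r1^i z^-i) / (1 - sum b_i r2^i z^-i).

    b holds b_1..b_P; the filter starts from zero state (an assumption).
    """
    num = [b[i] * r1 ** (i + 1) for i in range(len(b))]
    den = [b[i] * r2 ** (i + 1) for i in range(len(b))]
    out = []
    for n in range(len(signal)):
        acc = signal[n]
        for i in range(len(b)):
            if n - i - 1 >= 0:
                acc -= num[i] * signal[n - i - 1]  # zeros (numerator)
                acc += den[i] * out[n - i - 1]     # poles (denominator)
        out.append(acc)
    return out
```

When r1 equals r2 the numerator and denominator cancel and the signal passes through unchanged; the further apart the two constants are, the more strongly the filter reshapes the error spectrum.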

Next, an excitation codebook searching circuit 230
selects an excitation code vector so as to minimize the
following equation (7).

$$\sum_{n=0}^{N-1} \left[ x_{zm}(n) - \gamma_j \cdot c_j(n) * h(n) * W_m(n) \right]^2 \tag{7}$$

Where $\gamma_j$ is an optimal gain for the code vector
$c_j(n)$ ($j = 0, \ldots, 2^B - 1$, where B is the number of bits of the
excitation codebook).
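
An exhaustive search of this kind can be sketched in Python as below. The sketch makes several assumptions that the text does not state: the impulse responses h(n) and W_m(n) are given as finite lists, convolutions are truncated to the subframe length, and the optimal gain for each code vector is taken as the least-squares value. The function names are hypothetical.

```python
def convolve_truncated(signal, h):
    """Convolve signal with impulse response h, keeping len(signal) samples."""
    return [
        sum(h[k] * signal[n - k] for k in range(min(n, len(h) - 1) + 1))
        for n in range(len(signal))
    ]


def search_excitation_codebook(target, codebook, h, w):
    """Return (index, gain, error) minimizing sum((target - gain * c_j*h*w)^2)."""
    target_energy = sum(t * t for t in target)
    best = None
    for j, code_vector in enumerate(codebook):
        # Pass the code vector through the synthesis and weighting responses.
        y = convolve_truncated(convolve_truncated(code_vector, h), w)
        energy = sum(s * s for s in y)
        if energy == 0.0:
            continue
        corr = sum(t * s for t, s in zip(target, y))
        gain = corr / energy
        error = target_energy - corr * corr / energy
        if best is None or error < best[2]:
            best = (j, gain, error)
    return best
```

A codebook of B bits holds 2^B vectors, so the loop body runs 2^B times per subframe; practical coders usually precompute the filtered code vectors or their correlations to reduce that cost.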

It is to be noted that the excitation codebook 235
is made in advance through training. For example, for
the codebook design method by training, it is possible
to refer to the paper titled "An Algorithm for Vector
Quantizer Design" (IEEE Trans. COM-28, pp. 84-95,
1980) by Linde et al. (reference No. 10) and so on.

A gain quantization circuit 282 quantizes gains of 
the adaptive codebook 210 and the excitation codebook 
235 using the gain codebook 285. 

An adder 290 adds the adaptive code vector of the
adaptive codebook 210 and the excitation code vector of
the excitation codebook searching circuit 230 as below,
and outputs the result.

$$v(n) = \beta \cdot v(n-M) + \gamma_j c_j(n) \tag{8}$$

A synthesis filter 281 receives the output v(n) of the
adder 290 and calculates synthesized voices for one frame
according to the following equation. In addition, it inputs
a string of zeros to the filter for another frame to
calculate a response signal string, and outputs the
response signal string for one frame to the subtracter
190.

$$\hat{x}(n) = v(n) + \sum_{i=1}^{L} a_i \hat{x}(n-i) \tag{9}$$

$$v(n) = 0 \quad (N \le n \le 2N-1) \tag{10}$$
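
The synthesis step and the zero-input response can be sketched in Python as follows. This is a minimal sketch assuming an all-pole recursion in which each output sample is the input plus a weighted sum of the L previous outputs, followed by one further frame of zero input; the function names and the handling of filter history are hypothetical.

```python
def synthesis_filter(v, a, history=None):
    """Compute out(n) = v(n) + sum_{i=1..L} a_i * out(n - i).

    history holds earlier outputs, most recent last (empty means zero state).
    """
    past = list(history) if history else []
    out = []
    for sample in v:
        acc = sample
        for i, coeff in enumerate(a, start=1):
            if i <= len(past):
                acc += coeff * past[-i]
        past.append(acc)
        out.append(acc)
    return out


def frame_and_response(v, a):
    """Synthesize one frame, then feed zeros for a second frame of equal length."""
    extended = list(v) + [0.0] * len(v)
    out = synthesis_filter(extended, a)
    return out[:len(v)], out[len(v):]
```

The second list returned by `frame_and_response` is the part of the current frame's output that rings into the next frame, which is what the subtracter removes from the incoming voice signal before the next search.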

A multiplexer 260 combines the output code strings of
the LSP quantization circuit 140, the adaptive codebook 210 and the
excitation codebook searching circuit 230, and outputs the
result.

This is the explanation of the first embodiment.

Next, the second embodiment is explained.

Fig. 2 is a block diagram showing the second
embodiment. In Fig. 2, a component given the same
number as in Fig. 1 operates in the same way as in Fig. 1, so
its explanation is omitted.

In the second embodiment, a band dividing circuit
300 for subbanding the input voices in advance is added
to the first embodiment. Here, for simplicity,
the number of divisions is assumed to be two and a method
using a QMF filter is used as the dividing method. Under
these conditions, lower frequency signals and
higher frequency signals are output.

For example, letting the frequency bandwidth of the
input voice be fw (Hz), it is possible to divide the band
into 0 to fw/2 for the lower band and fw/2 to fw for the
higher band.

Then, a switch 310 is switched to the upper side when processing
lower band signals and to the lower side when processing
higher band signals.

It is to be noted that, as a method for subbanding
using QMF filters, for example, it is possible to refer
to the book titled "Multirate Signal Processing"
(Prentice-Hall, 1983) by Crochiere et al. (reference
No. 11) and so on. In addition, as another method, it is
possible to consider applying an FFT to the
signals, dividing the frequencies in the FFT domain, and then
applying an inverse FFT.
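
The FFT-based alternative can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the band dividing circuit 300: a direct DFT replaces the FFT, the split point is fixed at half of the signal bandwidth, and no decimation is performed, whereas a QMF bank would also halve the sampling rate of each band. The function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct (slow) DFT of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft_real(spectrum):
    """Inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def split_bands(x):
    """Split x into a lower band (0 to fw/2) and a higher band (fw/2 to fw)."""
    n = len(x)
    spectrum = dft(x)
    # Bin k and bin n - k carry the same frequency, so both are tested together.
    low = [spectrum[k] if min(k, n - k) < n // 4 else 0 for k in range(n)]
    high = [spectrum[k] - low[k] for k in range(n)]
    return idft_real(low), idft_real(high)
```

Because the two masks are complementary, adding the lower and higher band signals sample by sample gives back the original input.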

Here, for the voice signal in each subbanded band,
auditory sense weighting filter coefficients
are calculated in the same manner as in the first
embodiment, auditory sense weighting is performed, and
the excitation codebook is searched.

It is possible to prepare two kinds of excitation 
codebooks for the lower band and the higher band and to 
use them by switching. 

This is the explanation for the second embodiment 
of the present invention. 

Next, the third embodiment is explained.

The third embodiment further comprises, in addition to the
second embodiment, a bit
allocation section for allocating quantization bits to
the voice signals in the subbanded bands.

Fig. 3 is a block diagram showing the third
embodiment. In this figure, the explanation of a component given
the same number as in Fig. 1 and Fig. 2 is omitted
because it operates in the same way as in Fig. 1 and
Fig. 2.

In Fig. 3, switches 320-1 and 320-2 switch the
circuit to the lower band or the higher band, and output
lower band signals or higher band signals, respectively.
The switch 320-2 outputs information indicating whether
an output signal belongs to the lower band or the higher
band to the codebook switching circuit 350.

A masking threshold value calculator 360 calculates 
masking threshold values in all bands for signals that 
are not subbanded yet, and allocates them to the lower 
band or the higher band. Then, the masking threshold 
value calculator 360 calculates auditory sense weighting 
filter coefficients for the lower band or the higher 
band in the same manner as the first embodiment, and 
outputs them to the auditory sense weighting circuit 
220. 

Using the outputs of the masking threshold value
calculator 360, a bit allocation calculator 340
allocates numbers of quantization bits to the lower band
and the higher band, and outputs the results to a
codebook switching circuit 350. Possible bit allocation
methods include, for example, a method using the power
ratio of the subbanded lower band signal and the
subbanded higher band signal, or a method using the
ratio of the lower band mean or minimum masking
threshold value to the higher band mean or minimum
masking threshold value obtained when the masking
threshold values are calculated in the masking threshold
value calculator 360.

The codebook switching circuit 350 inputs the numbers
of quantization bits from the bit allocation calculator
340 and the lower band or higher band information from
the switch 320-2, and switches the excitation codebooks
and gain codebooks. Here, the codebooks can be prepared
in advance by using training data, or can be random
number codebooks having predetermined stochastic
characteristics.

Here, for bit allocation, it is possible to use 
another well-known method such as a method using a power 
ratio of the lower band and the higher band. 

The above is the explanation for the third
embodiment of the present invention.

Next, the fourth embodiment is explained.

In the fourth embodiment, a multi-pulse calculator
300 for calculating multi-pulses is provided instead of
the excitation codebook searching circuit 230.

Fig. 4 is a block diagram of the fourth embodiment.
In Fig. 4, explanation of a component referred to by the
same number as in Fig. 1 is omitted, because it operates
in the same manner as in Fig. 1.

The multi-pulse calculator 300 calculates the
amplitudes and locations of the multi-pulses that
minimize the following equation.

[Equation (11)]

Where g_j is the amplitude of the j-th multi-pulse, m_j
is the location of the j-th multi-pulse, and k is the
number of multi-pulses.

The above is the explanation of the fourth
embodiment of the present invention.

Next, the fifth embodiment is explained.

The fifth embodiment is a case of providing the
auditory sense weighting circuit 220 of the first
embodiment ahead of the adaptive codebook 210, as shown
in Fig. 5, and searching for an adaptive code vector with
an auditory sense weighted signal. Because auditory
sense weighting is conducted before the search for an
adaptive code vector in the fifth embodiment, all
searching after this step, for example the search of the
excitation codebook, is also conducted with an auditory
sense weighted signal.

Input voice signals are weighted in the auditory
sense weighting circuit 220 in the same manner as in the
first embodiment. The outputs of the synthesis filter
are subtracted from the weighted signals in the
subtracter 190, and the result is input to the adaptive
codebook 210.

The adaptive codebook 210 calculates the delay M and
the gain \beta of the adaptive codebook that minimize
the following equation.

D = \sum_{n} \left( x'_{wm}(n) - \beta \, v(n - M) * h_{wm}(n) \right)^2 \qquad (12)

Where x'_wm(n) is an output signal of the
subtracter 190, and h_wm(n) is an output signal of the
impulse response calculating circuit 170.
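
The search over the delay M and the gain \beta can be sketched as follows. This is a minimal illustration, not the circuit of the embodiment: the function names, the delay range, and the restriction that the delay is at least one subframe long (so that only past samples of v(n) are needed) are assumptions made for the example. For each candidate delay the gain has a closed form, namely the correlation between the target and the filtered past excitation divided by the energy of the filtered past excitation.

```python
def convolve_truncated(signal, impulse_response):
    """Convolve two sequences and keep only the first len(signal) samples."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for i in range(min(n + 1, len(impulse_response))):
            acc += impulse_response[i] * signal[n - i]
        out.append(acc)
    return out


def search_adaptive_codebook(target, past_excitation, impulse_response,
                             min_delay, max_delay):
    """Return (delay, gain, distortion) minimizing the squared error
    sum_n (target[n] - gain * (v(n - delay) * h)(n))^2.

    Assumes min_delay >= len(target) and max_delay <= len(past_excitation).
    """
    n_samples = len(target)
    energy = sum(x * x for x in target)
    best = None
    for delay in range(min_delay, max_delay + 1):
        # v(n - delay) for n = 0..N-1, read from the end of the past buffer.
        start = len(past_excitation) - delay
        segment = past_excitation[start:start + n_samples]
        filtered = convolve_truncated(segment, impulse_response)
        cross = sum(x * y for x, y in zip(target, filtered))
        power = sum(y * y for y in filtered)
        if power <= 0.0:
            continue  # a silent candidate cannot reduce the error
        gain = cross / power
        distortion = energy - cross * cross / power
        if best is None or distortion < best[2]:
            best = (delay, gain, distortion)
    return best
```

The distortion is evaluated as the target energy minus the squared correlation divided by the candidate energy, which is the value of the squared error once the optimal gain has been substituted.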

Then, the output signal of the adaptive codebook is 
input to the subtracter 195 in the same manner as the 
first embodiment and used for searching of the 
excitation codebook. 

The above is the explanation of the fifth 
embodiment of the present invention. 

It is to be noted that the critical band analysis
filters in the above-mentioned embodiments can be
substituted by other well-known filters operating
equivalently to the critical band analysis filters.

Also, the calculation methods for the masking
threshold values can be substituted by other well-known
methods.

Furthermore, the excitation codebook can be
substituted by other well-known configurations. For
the configuration of the excitation codebook, it is
possible to refer to the paper titled "On reducing
computational complexity of codebook search in CELP
coder through the use of algebraic codes" (Proc. ICASSP,
pp. 177-180, 1990) by C. Laflamme et al. (reference
No. 12) and the paper titled "CELP: A candidate for GSM
half-rate coding" (Proc. ICASSP, pp. 469-472, 1990) by I.
Trancoso et al. (reference No. 13).

Furthermore, better characteristics can be obtained
by using more efficient codebooks based on matrix
quantization, finite-state vector quantization, trellis
quantization, delayed decision quantization and so on.
For more detailed information, it is possible to refer
to the paper titled "Vector quantization" (IEEE ASSP
Magazine, pp. 4-29, 1984) by Gray (reference No. 14) and
so on.

The above embodiments are explained with a 1-stage
excitation codebook. However, the excitation codebook
could also be multi-staged, for example, 2-staged. This
kind of codebook could reduce the complexity of the
computations required for searching.

Also, the adaptive codebook was described as first
order, but sound quality can be improved by using second
or higher order, or by using fractional values instead
of integers as delay values. For details, the paper
titled "Pitch predictors with high temporal resolution"
(Proc. ICASSP, pp. 661-664, 1990) by P. Kroon et al.
(Reference No. 15), and so on can be referred to.

In the above embodiments, the spectrum parameters are
obtained by LPC analysis and coded as LSP parameters,
but other common parameters, for example, LPC cepstrum,
cepstrum, improved cepstrum, generalized cepstrum,
mel-cepstrum or the like can also be used as the
spectrum parameters.

Also, the optimal analysis method can be used for 
each parameter. 

In vector quantization of LSP parameters, vector
quantization can be conducted after nonlinear conversion
is conducted on LSP parameters to account for auditory
sense characteristics. A known example of nonlinear
conversion is Mel conversion.

It is also possible to have a configuration in
which the LPC coefficients calculated for each frame are
interpolated for each subframe, either in the LSP domain
or in the linear predictive coefficient domain, and the
interpolated coefficients are used in the searches of
the adaptive codebook and the excitation codebook. Sound
quality can be further improved with this type of
configuration.

Auditory sense weighting based on the masking
threshold values indicated in the embodiments can be
used for quantization of the gain codebook, spectral
parameters and LSP.

Also, when determining auditory sense weighting
filters, it is possible to use masking threshold values
from simultaneous masking together with masking
threshold values from successive masking.

Furthermore, instead of determining auditory sense 
weighting coefficients directly from masking threshold 
values, it is possible to multiply masking threshold 
values by weighting coefficients and then convert the 
results to auditory sense weighting filter coefficients. 

Other common configurations for auditory sense 
weighting filter can also be used. 

Next, the sixth embodiment is explained. 

Fig. 6 is a block diagram showing the sixth
embodiment. Here, for simplicity, an example of
allocating the number of bits of the codebooks based on
masking threshold values when searching excitation
codebooks is shown. However, it can also be applied to
adaptive codebooks and other types of codebooks.

In Fig. 6, at the transmitting side, voice signals are
input from an input terminal 600 and one frame of voice
signals (20 ms, for example) is stored in a buffer
memory 610.

An LPC analyzer 630 conducts well-known LPC
analysis on the voice signals of said frame and
calculates LPC parameters that represent the spectral
characteristics of the framed voice signals up to a
preset order L.

Then, an LSP quantization circuit 640 quantizes the
LSP parameters with a preset number of quantization bits
and outputs the obtained code l_k to a multiplexer 790.
The code is decoded and transformed to the linear
prediction coefficients a'_i (i = 1 to P), which are
output to an impulse response calculating circuit 670
and a synthetic filter 795. For the coding method of LSP
parameters and the transformation between LSP parameters
and linear prediction coefficients, it is possible to
refer to the above-mentioned Reference No. 4, etc. In
addition, for more efficient quantization of LSP
parameters, vector-scalar quantization or other
well-known vector quantization methods can be used. For
LSP vector-scalar quantization, the above-mentioned
Reference No. 5, etc. can be referred to.

A subframe dividing circuit 650 divides framed
voice signals into subframes. Here, for example, the
subframe length is supposed to be 5 ms.

A masking threshold value calculating circuit 705
performs an N-point FFT on an input signal x(n) and
calculates a spectrum X(k) (where k = 0 to N-1).
Subsequently, it calculates the power spectrum |X(k)|^2,
analyzes the result by using critical band filter models
or auditory sense models, and calculates the power or
RMS of each critical band. Here, the following equation
is used for the calculation of power.

B_i = \sum_{k = bl_i}^{bh_i} |X(k)|^2 \quad (i = 1 \text{ to } R) \qquad (13)

Here, bl_i and bh_i are the lower limit frequency and
the upper limit frequency of the i-th critical band,
respectively. R represents the number of critical bands
included in the voice signal band. For the critical
bands, the above-mentioned Reference No. 8 can be
referred to.
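
As a rough illustration of the critical band power calculation, the following sketch computes a power spectrum with a direct DFT and sums it between the lower and upper bin of each band. The function names and the band edges used in the example are hypothetical; the actual critical band limits are those of Reference No. 8 and are not reproduced here, and a practical coder would use an FFT rather than the direct DFT shown.

```python
import cmath


def power_spectrum(x):
    """Return |X(k)|^2 for k = 0..N-1 using a direct DFT (small N only)."""
    n = len(x)
    spectrum = []
    for k in range(n):
        acc = 0j
        for t, sample in enumerate(x):
            acc += sample * cmath.exp(-2j * cmath.pi * k * t / n)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def critical_band_powers(spectrum_power, band_edges):
    """Sum the power spectrum over each band.

    band_edges is a list of (lower_bin, upper_bin) pairs, both inclusive,
    playing the role of the lower and upper limits of each band.
    """
    return [sum(spectrum_power[lo:hi + 1]) for lo, hi in band_edges]
```

For a unit impulse every bin has power 1, so each band power simply equals the number of bins in the band.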

Then, a spreading function is convolved with the
critical band spectrum according to the following
equation.

C_i = \sum_{j=1}^{b_{max}} B_j \, sprd(j, i) \qquad (14)

Here, sprd(j, i) is a spreading function, and
Reference No. 6 can be referred to for its specific
values. b_max is the number of critical bands included
between the angular frequencies 0 and \pi.
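
The spreading step is a weighted sum over bands, which can be sketched as below. The spreading function `toy_sprd` is purely hypothetical (a 10 dB drop in power per band of distance) and only stands in for the values given in Reference No. 6.

```python
def spread_band_powers(band_powers, sprd):
    """Return the spread spectrum: for each band i, the sum over j of
    band_powers[j] * sprd(j, i)."""
    n_bands = len(band_powers)
    return [
        sum(band_powers[j] * sprd(j, i) for j in range(n_bands))
        for i in range(n_bands)
    ]


def toy_sprd(j, i):
    """Hypothetical spreading function: 1 within the same band, and a
    factor of 10 less power for every band of distance."""
    return 10.0 ** (-abs(i - j))
```

With all the power in the first band, the spread spectrum decays by a factor of ten from one band to the next, which shows how a strong band raises the masking level of its neighbours.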

Next, the masking threshold value spectrum Th_i is
calculated using the following equation.

Th_i = C_i T_i \qquad (15)

Where

T_i = 10^{-(O_i / 10)} \qquad (16)

O_i = \alpha (14.5 + i) + (1 - \alpha) 5.5 \qquad (17)

\alpha = \min[(NG / R), 1.0] \qquad (18)

NG = 10 \log_{10} \prod_{i=1}^{M} [1 - k_i^2] \qquad (19)

Here, k_i is the i-th k parameter, and it is
calculated by transforming a linear prediction
coefficient input from the LPC analyzer 630 using a
well-known method. M is the order of the linear
prediction analysis.

Considering absolute threshold values, the masking
threshold value spectrum is represented as below.

T'_i = \max[Th_i, absth_i] \qquad (20)

Where absth_i is the absolute threshold value in the
i-th critical band; for its values, Reference No. 7 can
be referred to.
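
The chain from the spread spectrum to the final threshold, Equations (15) to (20), can be sketched as follows. The sketch applies the equations literally as read from the text: NG is computed from the k parameters, \alpha is NG divided by R and limited to 1.0, and the offset O_i interpolates between 14.5 + i and 5.5. The function name, the argument `r` and the numerical inputs are assumptions for illustration, and the absolute thresholds of Reference No. 7 are not reproduced.

```python
import math


def masking_thresholds(spread_powers, k_params, r, absolute_thresholds):
    """Return the masking threshold of each critical band.

    spread_powers       -- spread critical band spectrum (C_i)
    k_params            -- k parameters from linear prediction analysis
    r                   -- the divisor R of Equation (18)
    absolute_thresholds -- absolute threshold of each band (absth_i)
    """
    # Equation (19): NG = 10 log10 prod(1 - k_i^2)
    ng = 10.0 * math.log10(math.prod(1.0 - k * k for k in k_params))
    # Equation (18)
    alpha = min(ng / r, 1.0)
    thresholds = []
    bands = zip(spread_powers, absolute_thresholds)
    for i, (c, absth) in enumerate(bands, start=1):
        offset = alpha * (14.5 + i) + (1.0 - alpha) * 5.5   # Equation (17)
        t = 10.0 ** (-offset / 10.0)                        # Equation (16)
        th = c * t                                          # Equation (15)
        thresholds.append(max(th, absth))                   # Equation (20)
    return thresholds
```

When all k parameters are zero, NG is 0 and \alpha is 0, so every band uses the fixed offset of 5.5 dB; a band whose scaled power falls below the absolute threshold is raised to that threshold.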

Next, by transforming the frequency axis from the
Bark axis to the Hz axis, a power spectrum P_m(f)
corresponding to the masking threshold value spectrum
T'_i (i = 1...b_max) is obtained. By performing an
inverse FFT on it, an auto-correlation function
r(j) (j = 0...N-1) can be calculated.

Subsequently, by performing a well-known linear
prediction analysis on the auto-correlation function,
filter coefficients b_i (i = 1...P) are calculated.
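
One well-known way to obtain predictor coefficients from an auto-correlation function is the Levinson-Durbin recursion. The text does not say which linear prediction analysis is used, so the choice of this recursion, the function name and the sign convention (a sample predicted as the weighted sum of previous samples) are assumptions of this sketch.

```python
def levinson_durbin(r, order):
    """Return (coefficients b_1..b_P, residual energy) from the
    auto-correlation values r[0..P], using the convention
    x(n) ~ sum_i b_i * x(n - i).

    Assumes r[0] > 0 and a positive definite auto-correlation sequence.
    """
    coeffs = [0.0] * (order + 1)   # coeffs[0] is unused
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(coeffs[i] * r[m - i] for i in range(1, m))
        k = acc / error             # reflection coefficient of stage m
        updated = coeffs[:]
        updated[m] = k
        for i in range(1, m):
            updated[i] = coeffs[i] - k * coeffs[m - i]
        coeffs = updated
        error *= 1.0 - k * k
    return coeffs[1:], error
```

For the auto-correlation of a first-order process with correlation 0.5, the recursion returns a first coefficient of 0.5 and a second coefficient of 0, as expected.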

Using the filter coefficients b_i, the auditory sense
weighting circuit 720 filters the supplied voice signals
with a filter having the transfer characteristics
specified by Equation (21), thereby performing auditory
sense weighting on the voice signals, and outputs a
weighted signal x_wm(n).

[Equation (21)]

Where \gamma_1 and \gamma_2 are constants for
controlling the weighting quantity; they usually satisfy
0 \le \gamma_2 < \gamma_1 \le 1.

An impulse response calculating circuit 670
calculates the impulse response h_wm(n) of a filter
having the transfer characteristics of Equation (22)
over a preset length, and outputs the result.

[Equation (22)]

Where

1/A(z) = 1 / \left[ 1 - \sum_{i=1}^{P} a'_i z^{-i} \right] \qquad (23)

and a'_i is output from the LSP quantization circuit 640.

A subtracter 690 subtracts the output of the 
synthetic filter 795 from a weighted signal and outputs 
a result. 

An adaptive codebook 710 inputs the weighted
impulse response h_wm(n) from the impulse response
calculating circuit 670 and a weighted signal from the
subtracter 690. Then, it performs pitch prediction based
on long-term correlation, and calculates the delay M and
the gain \beta as pitch parameters.

In the following explanations, the prediction order
of the adaptive codebook is supposed to be 1; however,
it can also be 2 or more. For the calculation of the
delay M in the adaptive codebook, the above-mentioned
References No. 1 and No. 2 can be referred to.

Subsequently, the gain \beta is calculated, and an
adaptive code vector x_z(n) is calculated according to
the following equation, in which the adaptive codebook
contribution is subtracted from the output of the
subtracter 690.

x_z(n) = x_{wm}(n) - \beta \cdot v(n - M) * h_{wm}(n) \qquad (24)

Where x_wm(n) is an output signal of the subtracter
690, v(n) is a past synthetic filter driving signal, and
h_wm(n) is output from the impulse response calculating
circuit 670. The symbol * represents convolution.

A bit allocating circuit 715 inputs a masking
threshold value spectrum T_i, T'_i or T''_i. Then, it
performs bit allocation according to Equation (25) or
Equation (26).

nsMRJ jnns^J ) (25) 
Rj = r + i/2iog 2 {| n smrAi\ n n smr~ j (26) 

Where, to set the number of bits of the whole frame to
a preset value as shown by Equation (27), the number
of bits is adjusted so that the number of bits allocated
to each subframe is in the range from the lower limit
number of bits to the upper limit number of bits.

\sum_{j=1}^{L} R_j = R_T \qquad (27)
R_{min} < R_j < R_{max}

Where R_j, R_T, R_min and R_max represent the number of
bits allocated to the j-th subframe, the total number of
bits of the whole frame, the lower limit number of bits
of a subframe and the upper limit number of bits of a
subframe, respectively. L represents the number of
subframes in a frame.
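
The adjustment toward a fixed frame total can be sketched as follows. The allocation rule used here (the average number of bits plus half the base-2 logarithm of each SMR relative to the geometric mean) is the generic log-ratio rule and is only an assumption standing in for Equations (25) and (26); the repair loop and the inclusive treatment of the limits are likewise choices made for the example.

```python
import math


def allocate_bits(smr, total_bits, min_bits, max_bits):
    """Allocate an integer number of bits to each subframe so that the
    frame total equals total_bits and every subframe stays within
    [min_bits, max_bits] (limits treated as inclusive in this sketch)."""
    n = len(smr)
    mean_log = sum(math.log2(s) for s in smr) / n
    # Real-valued targets from the assumed log-ratio rule.
    raw = [total_bits / n + 0.5 * (math.log2(s) - mean_log) for s in smr]
    bits = [min(max(int(round(x)), min_bits), max_bits) for x in raw]
    # Repair rounding and clipping until the frame total is met.
    while sum(bits) != total_bits:
        step = 1 if sum(bits) < total_bits else -1
        movable = [j for j in range(n)
                   if min_bits <= bits[j] + step <= max_bits]
        if not movable:
            raise ValueError("total_bits is not reachable within the limits")
        # Move the subframe that is furthest from its real-valued target.
        j = max(movable, key=lambda idx: step * (raw[idx] - bits[idx]))
        bits[j] += step
    return bits
```

A subframe with a larger SMR receives more bits, and when one subframe hits the upper limit the bits it cannot use are redistributed to the others.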

As a result of the above processing, bit
allocation information is output to the multiplexer 790.

The excitation codebook searching circuit 730, which
has codebooks 750_1 to 750_N with mutually different
numbers of bits, inputs the number of bits allocated to
each subframe and switches among the codebooks (750_1 to
750_N) according to that number of bits. It then
selects an excitation code vector that minimizes the
following equation.

D_k = \sum_{n} \left[ x_z(n) - \gamma_k \cdot c_k(n) * h_{wm}(n) \right]^2 \qquad (28)

Where \gamma_k is the optimal gain for a code vector
c_k(n) (k = 0...2^B - 1, where B is the number of bits
of the excitation codebook), and h_wm(n) is the impulse
response calculated with the impulse response
calculating circuit 670.
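
A full search of one excitation codebook can be sketched as follows. The function names and the tiny codebook in the usage are hypothetical; switching among codebooks of different sizes would amount to passing a different `codebook` list for each subframe.

```python
def filter_vector(vector, impulse_response):
    """Convolve a code vector with the weighted impulse response and keep
    only the first len(vector) samples."""
    out = []
    for n in range(len(vector)):
        acc = 0.0
        for i in range(min(n + 1, len(impulse_response))):
            acc += impulse_response[i] * vector[n - i]
        out.append(acc)
    return out


def search_excitation_codebook(target, codebook, impulse_response):
    """Return (index, gain, distortion) of the code vector that minimizes
    sum_n (target[n] - gain * (c_k * h)(n))^2, where gain is chosen
    optimally for each code vector."""
    best = None
    for index, code_vector in enumerate(codebook):
        filtered = filter_vector(code_vector, impulse_response)
        power = sum(y * y for y in filtered)
        if power <= 0.0:
            continue  # an all-zero vector cannot reduce the error
        gain = sum(x * y for x, y in zip(target, filtered)) / power
        distortion = sum((x - gain * y) ** 2
                         for x, y in zip(target, filtered))
        if best is None or distortion < best[2]:
            best = (index, gain, distortion)
    return best
```

For example, `search_excitation_codebook(target, codebook, h)` with a target built from one of the code vectors returns the index of that vector together with the gain that was used to build it.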

It is possible, for example, to prepare the
excitation codebook using Gaussian random numbers as
shown in Reference No. 1, or by training in advance. For
the method of designing the codebook by training, it is
possible to refer, for example, to the paper titled "An
Algorithm for Vector Quantization Design" (IEEE Trans.
COM-28, pp. 84-95, 1980) by Linde et al.

The gain codebook searching circuit 760 searches for
and outputs a gain code vector that minimizes the
following equation, using the selected excitation code
vector and the gain codebook 770.

D_k = \sum_{n} \left[ x_{wm}(n) - g_{1k} \cdot v(n - M) * h_{wm}(n) - g_{2k} \cdot c(n) * h_{wm}(n) \right]^2 \qquad (29)

Where g_1k and g_2k are the elements of the k-th
two-dimensional gain code vector, and c(n) is the
selected excitation code vector.
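
The gain codebook search can be sketched as follows, assuming that the error to be minimized is the squared difference between the weighted signal and the sum of the gain-scaled adaptive and excitation contributions. Both contributions are assumed to have been filtered by the weighted impulse response beforehand; the function name and the gain pairs in the usage are hypothetical.

```python
def search_gain_codebook(target, filtered_adaptive, filtered_excitation,
                         gain_codebook):
    """Return (index, distortion) of the gain pair (g1, g2) that minimizes
    sum_n (target[n] - g1 * ya[n] - g2 * yc[n])^2, where ya and yc are the
    filtered adaptive and excitation contributions."""
    best = None
    for index, (g1, g2) in enumerate(gain_codebook):
        distortion = sum(
            (x - g1 * ya - g2 * yc) ** 2
            for x, ya, yc in zip(target, filtered_adaptive,
                                 filtered_excitation)
        )
        if best is None or distortion < best[1]:
            best = (index, distortion)
    return best
```

Quantizing the two gains jointly, rather than one after the other, lets the gain codebook exploit the correlation between them.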

Next, the indexes of the selected adaptive code vector,
the excitation code vector and the gain code vector are
output.

The multiplexer 790 combines the outputs of the LSP 
quantization circuit 640, the bit allocating circuit 715 
and the gain codebook searching circuit 760 and outputs 
a result. 

The synthetic filter circuit 795 calculates 
weighted regeneration signal using an output of the gain 
codebook searching circuit 760, and outputs a result to 
the subtracter 690. 

The above is the explanation of the sixth 
embodiment . 

Next, the seventh embodiment is explained. 

Fig. 7 is a block diagram showing the seventh 
embodiment . 

Explanation for a component in Fig. 7 referred by 
the same number as that in Fig. 6 is omitted, because it 
operates similarly to that of Fig. 6. 

A subbanding circuit 800 divides voice signals into
a preset number of bands, W, for example.

The band width of each band is set in advance. QMF
filter banks are used for subbanding. For configurations
of the QMF filter banks, it is possible to refer to the
paper titled "Multirate digital filters, filter banks,
polyphase networks, and applications: A tutorial" (Proc.
IEEE, pp. 56-93, 1990) by P. Vaidyanathan et al.
(Reference No. 16).

The masking threshold value calculating circuit 910
calculates the masking threshold values of each critical
band similarly to the masking threshold value
calculating circuit 705. Then, according to Equation
(30), it calculates SMR_kj using the masking threshold
values included in each band subbanded with the
subbanding circuit 800, and outputs the result to the
bit allocating circuit 920.

SMR_{kj} = P_{kj} / T_{kj} \qquad (30)

In addition, it calculates filter coefficients b_i
from the masking threshold values included in each band
in the same manner as the masking threshold value
calculating circuit 705 of Fig. 6, and outputs the result
to the voice coding circuits 900_1 to 900_W.

According to Equation (31), the bit allocating
circuit 920 allocates a number of bits to each subframe
and band using SMR_kj (j = 1...L, k = 1...W) supplied by
the masking threshold value calculating circuit 910, and
outputs the result to the voice coding circuits 900_1 to
900_W.

* v = /?+!/21og 2 {[5M/? v ]/[n SMR^ L ) (31)

Where j and k of R_kj denote the j-th subframe and the
k-th band, respectively. Here, j = 1...L, k = 1...W.

Fig. 8 is a block diagram showing the configuration of
the voice coding circuits 900_1 to 900_W.

Only the configuration of the voice coding circuit
900_1 of the first band is shown in Fig. 8, because all
of the voice coding circuits 900_1 to 900_W operate
similarly to each other. Explanation for a component in
Fig. 8 referred to by the same number as that in Fig. 7
is omitted, because it operates similarly to that of
Fig. 7.

The auditory sense weighting circuit 720 inputs the
filter coefficients b_i for performing auditory sense
weighting, and operates in the same manner as the
auditory sense weighting circuit 720 in Fig. 7.

The excitation codebook searching circuit 730
inputs the bit allocation value R_kj for each band, and
switches the number of bits of the excitation codebooks.

This is explanation for the seventh embodiment. 

Next, the eighth embodiment is explained. 

Fig. 9 is a block diagram showing the eighth
embodiment. Explanation for a component in Fig. 9
referred to by the same number as that in Fig. 7 or
Fig. 8 is omitted, because it operates similarly to that
of Fig. 7 or Fig. 8.

The excitation codebook searching circuit 1030
inputs bit allocation values for each subframe and band
from the bit allocating circuit 920, and switches
excitation codebooks for each band and subframe
according to the bit allocation values. It has, for each
band, N kinds of codebooks with different numbers of
bits. For example, band 1 has codebooks 1000_11 to
1000_1N.

In addition, for each band, the impulse response of
the corresponding subbanding filter is convolved with
all code vectors of the codebooks. In band 1, for
example, the impulse response of the subbanding filter
for band 1 is calculated using Reference No. 16 and is
convolved in advance with all code vectors of the N
codebooks of band 1.

Next, the bit allocation values for the respective
bands are input for each subframe, a codebook according
to the number of bits is read out for each band, and the
code vectors of all bands (W, in this example) are added
to create a new code vector c(n) according to the
following Equation (32).

c(n) = \sum_{i=1}^{W} c_i(n) \qquad (32)

Then, a code vector that minimizes Equation (28) is
selected.

If the search is done over all possible combinations
of the code vectors of the codebooks of all bands, an
enormous number of computational operations is needed.
Therefore, it is possible to adopt a method of
subbanding the output signals of the adaptive codebook,
selecting, for each band, a plurality of candidate code
vectors with small distortion from the codebook
concerned, restoring the full-band code vector using
Equation (32) for each combination of the candidates
over all bands, and selecting the code vector that
minimizes the distortion among all combinations. With
this method, the computational complexity of searching
code vectors can be remarkably reduced.
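
The candidate preselection described above can be sketched as follows. For brevity the sketch omits the gain and the weighting filter and measures distortion as a plain squared error; these simplifications, the function names and the per-band targets passed in are assumptions for illustration. The code vectors are assumed to have been convolved with the subbanding filters beforehand, as described for the eighth embodiment.

```python
from itertools import product


def squared_error(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_with_preselection(target, band_targets, band_codebooks,
                             n_candidates):
    """Return (combination of indexes, distortion).

    Step 1: in each band keep the n_candidates code vectors closest to
            that band's target.
    Step 2: add one candidate per band and keep the combination whose
            sum is closest to the full-band target.
    """
    shortlists = []
    for band_target, codebook in zip(band_targets, band_codebooks):
        ranked = sorted(
            range(len(codebook)),
            key=lambda k: squared_error(band_target, codebook[k]),
        )
        shortlists.append(ranked[:n_candidates])

    best = None
    for combo in product(*shortlists):
        summed = [
            sum(band_codebooks[b][k][n] for b, k in enumerate(combo))
            for n in range(len(target))
        ]
        distortion = squared_error(target, summed)
        if best is None or distortion < best[1]:
            best = (combo, distortion)
    return best
```

With W bands, codebooks of size S and K candidates per band, the second step evaluates K^W combinations instead of S^W, which is where the reduction in complexity comes from.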

In the above embodiments, as a method of deciding the
bit allocation, it is possible to use a method of
clustering SMR in advance, designing codebooks for bit
allocation, in which the SMR of each cluster and the
allocated numbers of bits are arranged in a table, for a
preset number of bits (B bits, for example), and using
these codebooks for calculating the bit allocation in
the bit allocating circuit. With this configuration, the
transmission information for bit allocation can be
reduced, because B bits per frame are enough for the
bit allocation information to be transmitted.

Moreover, in the seventh and eighth embodiments,
Equation (33) can be used for bit allocation for each
subframe and band.

* v = * + l/21og a {[ft$M^ (33) 

Where, Q k is a number of critical bands included in 
k-th subband. 

It is to be noted that, although the above
embodiments show examples of adaptively allocating the
numbers of bits of excitation codebooks, the present
invention can be applied to bit allocation for LSP
codebooks, adaptive codebooks and gain codebooks as well
as excitation codebooks.

Furthermore, as a bit allocating method in the bit
allocating circuits 715 and 920, it is possible to
allocate a number of bits once, perform quantization
using excitation codebooks with the allocated number of
bits, measure the quantization noise, and adjust the bit
allocation so that Equation (34) is maximized.

MNRj=\nSMRj loj (34) 

Where, * n j 2 is a quantization noise measured by 
using j-th subframe. 

Moreover, as a method for calculating the
masking threshold value spectrum, other well-known
methods can be used.

Next, the ninth embodiment is explained. 

Fig. 10 is a block diagram showing the ninth
embodiment. Explanation for a component in Fig. 10
referred to by the same number as that in Fig. 7 is
omitted, because it operates similarly to that of Fig. 7.

In the ninth embodiment, a multipulse calculating
circuit 1100 for calculating multipulses is provided
instead of the excitation codebook searching circuit
730.

The multipulse calculating circuit 1100 calculates
the amplitude and location of each multipulse based on
Equation (11) in the same manner as in the fourth
embodiment. However, the number of multipulses is
determined by the number of multipulses supplied from
the bit allocating circuit 715.



CA 02137756 1999-02-26 



THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE 
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS: 

1. A voice coder comprising: 

masking calculating means for calculating masking
threshold values from supplied discrete voice signals based on
auditory sense masking characteristics;

auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
weighting input signals based on said filter coefficients; 

a codebook which includes a plurality of code vectors; 
and a searching means for searching for a code vector in the 
codebook that minimizes error signal power between an output 
signal of said auditory sense weighting means and the code 
vectors in said codebook. 

2. The voice coder of claim 1, wherein said codebook is 
an excitation codebook. 

3. The voice coder of claim 1, wherein said codebook is 
an adaptive codebook. 

4. The voice coder of claim 1, further comprising a
subbanding means for subbanding said voice signals, wherein
said auditory sense weighting means performs weighting to
signals that have been subbanded with said subbanding means.

5. The voice coder of claim 4, further comprising:

a bit allocating means for allocating quantization bits
to the subbanded signals; and

switching means for switching a number of bits of said
codebook according to bits allocated with said bit allocating
means.

6. The voice coder of claim 1, further comprising a
subframe generating means for dividing said voice signals into
frames of a first pre-set time length and generating subframes
by dividing said frames into second pre-set time length
divisions, wherein searching of said codebook is performed for
each said subframe.

7. A voice coder comprising:

dividing means for dividing supplied discrete voice 
signals into first pre-set time length frames; 

subframe generating means for generating subframes by 
dividing said frames into second pre-set time length 
divisions; 

regenerating means for regenerating said voice signals 
for said subframes based on an adaptive codebook; 

masking calculating means for calculating masking 
threshold values for each of said subframes from said voice 
signals based on auditory sense masking characteristics; 

an auditory sense weighting means for calculating filter
coefficients based on said masking threshold values and
performing auditory sense weighting to a difference signal
formed as a difference between a signal regenerated with said
regenerating means and said voice signal based on said filter
coefficients;

an excitation codebook which includes a plurality of code 
vectors; and 

searching means for searching for a code vector in said 
excitation codebook that minimizes an error signal power 
between said auditory sense weighting means and the code 
vectors in said excitation codebook. 

8. The voice coder of claim 7, further comprising a
subbanding means for subbanding said voice signals, wherein
said auditory sense weighting means performs weighting to a
signal that has been subbanded with said subbanding means.

9. The voice coder of claim 8, further comprising:

bit allocating means for allocating quantization bits to
the subbanded signals; and

switching means for switching a number of bits of said
excitation codebook according to bits allocated with said bit
allocating means.

10. The voice coder of claim 7, further comprising
spectral parameter calculating means for calculating and
outputting a spectral parameter representing a spectral
envelope of said voice signal for each frame.

11. The voice coder of claim 7, wherein said
regenerating means calculates, for each of said subframes, a
pitch parameter so that a signal regenerated based on said
adaptive codebook which includes past excitation signals
approximates said voice signal.

12. The voice coder of claim 7, wherein said adaptive
codebook means calculates, for each of said subframes, a pitch
parameter so that a signal regenerated based on said adaptive
codebook which includes past excitation signals approximates
said voice signal.

13. A voice coder comprising:

dividing means for dividing supplied discrete voice 
signals into pre-set time length frames; 

subframe generating means for generating subframes by 
dividing said frames into pre-set time length divisions; 

masking calculating means for calculating masking 
threshold values for each of said subframes from said voice 
signals based on auditory sense masking characteristics; 

auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
performing auditory sense weighting to said voice signals 
based on said filter coefficients; 

adaptive codebook means for calculating an adaptive code
vector that minimizes power of a difference signal formed as a
difference between a response signal and a voice signal
weighted with said auditory sense weighting means;

an excitation codebook which includes a plurality of
excitation code vectors; and

searching means for searching for a code vector in said 
excitation codebook that minimizes an error signal power 
between an output signal generated from said adaptive codebook 
means and said difference signal. 

14. The voice coder of claim 13, further comprising a
subbanding means for subbanding said voice signals, wherein
said auditory sense weighting means performs weighting to
signals subbanded with said subbanding means.

15. The voice coder of claim 14, further comprising:

bit allocating means for allocating quantization bits to
the subbanded signals; and

switching means for switching a number of bits of said
excitation codebook according to bits allocated with said bit
allocating means.

16. The voice coder of claim 13, further comprising
spectral parameter calculating means for calculating and
outputting, for each of said frames, a spectral parameter
representing a spectral envelope of said voice signals.

17. The voice coder of claim 13, comprising a spectral
parameter calculating means for calculating and outputting,
for each of said frames, a spectral parameter representing
spectral envelope of said voice signals.

18. A voice coder comprising:

dividing means for dividing supplied discrete voice
signals into pre-set time length frames;

subframe generating means for generating subframes by 
dividing said frames into pre-set time length divisions; 

regenerating means for regenerating said voice signals 
for each of said subframes based on an adaptive codebook; 

masking calculating means for calculating masking 
threshold values from said voice signals based on auditory 
sense masking characteristics; 

auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
performing auditory sense weighting to an error signal formed 
as a difference between said voice signal and a signal 
regenerated with said regenerating means based on said filter 
coefficients; and 

calculating means for calculating a multi-pulse that 
minimizes an error signal power between an output signal of 
said auditory sense weighting means and said code vectors in 
said adaptive codebook. 

19. The voice coder of claim 18, further comprising a 
subbanding means for subbanding said voice signals, wherein 
said auditory sense weighting means performs weighting to a 
signal subbanded with said subbanding means. 

20. The voice coder of claim 19, further comprising:

a bit allocating means for allocating quantization bits
to subbanded signals; and

a switching means for switching a number of bits of said
excitation codebook according to bits allocated with said
allocating means.

21 • A method for searching a codebook used for codinq 

discrete voice signals, using signals weighted with masking 
threshold values calculated from said voice signals based on 
auditory sense masking characteristics, the method comprising 
the steps of: 

(a) dividing said voice signals into pre-set time length 
frames ? 

(b) generating subframes by dividing said frames Into 
pre-set time length divisions- 

(c) regenerating said voice signals for each of said 
subframes based on an adaptive codebook; 

(d) calculating masking threshold values from said voice 
signals based on auditory sense masking characteristics; 

ie) calculating filter coefficients based on said maskinq 
threshold values and performing auditory sense weighting to an 
error signal between a signal regenerated in the step (c) and 
said voice signal, based on said filter coefficients; and 

(f) searching for an excitation code vector in an
excitation codebook that minimizes the error signal power 
weighted in the step (e). 

22. The method for searching a codebook of claim 21,
further comprising the step of:

(g) calculating a multi-pulse that minimizes the error
signal power weighted in the step (e), instead of the step
(f).

23. The method for searching a codebook of claim 21,
further comprising the step of:

(g) subbanding said voice signals, wherein the step (d)
performs weighting to the subbanded signals.

24. The method for searching a codebook of claim 23, 
further comprising the step of:

(h) allocating quantization bits to the subbanded
signals; and 

(i) switching a number of bits of said excitation 
codebook according to bits allocated in the step (h). 

25. The method for searching a codebook used for
coding discrete voice signals, using signals weighted with
masking threshold values calculated from said voice signals
based on auditory sense masking characteristics, the method 
comprising the steps of: 

(1) dividing said voice signals into pre-set time length 
frames; 

(2) generating subframes by dividing said frames into 
pre-set time length divisions; 

(3) calculating masking threshold values from said voice 
signals based on auditory sense masking characteristics; 

(4) calculating filter coefficients based on said
masking threshold value and performing auditory sense
weighting to said voice signal based on said filter 
coefficients; 

(5) calculating, for each of said subframes and using a 
difference signal formed as a difference between a response 
signal and a voice signal weighted in the step (4), an 
adaptive code vector that minimizes a power of said difference 
signal, and regenerating said voice signal; and 

(6) searching for an excitation code vector in an 
excitation codebook that minimizes an error signal power 
between a signal regenerated in the step (5), and said voice 
signal.

26. The method for searching a codebook of claim 25, 
further comprising the step of: 

(7) calculating a multi-pulse that minimizes the error 
signal power weighted in the step (5), instead of the step
(6).

27. The method for searching a codebook of claim 25, 
further comprising the step of:

(7) subbanding said voice signals, wherein the step (4) 
performs weighting to the subbanded signals. 

28. The method for searching a codebook of claim 27, 
further comprising the step of: 

(8) allocating quantization bits to the subbanded 
signals; and 

(9) switching a number of bits of said excitation
codebook according to bits allocated in the step (8). 

29. A voice coder comprising:

dividing means for dividing supplied discrete voice 
signals into frames of a first pre-set time length and further
dividing said frames into subframes of a second pre-set time 
length smaller than said first pre-set time length; 

masking calculating means for calculating masking 
threshold values from said voice signals based on auditory 
sense masking characteristics; 

a plurality of codebooks of which bit numbers are 
different from each other; 

bit number allocating means for allocating a number of 
bits of said codebooks based on said masking threshold values;
and

searching means for searching a code vector by switching 
said codebooks for each of said subframes based on the 
allocated number of bits. 

30. The voice coder of claim 29, wherein said codebooks 
are excitation codebooks. 

31. The voice coder of claim 29, wherein said codebooks 
are gain codebooks. 



32. The voice coder of claim 29, further comprising a
subbanding means for subbanding said voice signals.

33. The voice coder of claim 32, wherein impulse
responses of subbanding filters are convoluted in each of said
codebooks.

34. The voice coder of claim 29, further comprising an 
auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
conducting auditory sense weighting to said voice signals 
based on said filter coefficients. 

35. A voice coder comprising: 

dividing means for dividing supplied discrete voice 
signals into frames of a pre-set time length; 

masking calculating means for calculating masking 
threshold values from said voice signals based on auditory 
sense masking characteristics; 

pitch calculating means for calculating pitch parameters 
so as to make signals regenerated based on said adaptive 
codebooks made of past excitation signals approximate, for 
each of said subframes, said voice signals; 

auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
conducting auditory sense weighting to error signals between 
signals regenerated with said pitch calculating means and said 
voice signals based on said filter coefficients; 

a plurality of excitation codebooks of which bit numbers 
are different from each other; 

bit allocating means for allocating a bit number of said 
excitation codebooks for each of said subframes based on said
masking threshold values; and

searching means for switching said excitation codebooks 
for each of said subframes based on the allocated number of 
bits and searching for an excitation code vector minimizing an
error signal power between an output signal generated from 
said auditory sense weighting means and code vectors in a 
switched excitation codebook. 



36. The voice coder of claim 35, further comprising 
subbanding means for subbanding said voice signals, wherein 
said bit allocating means allocates a bit number to subbanded 
signals.

37. The voice coder of claim 36, wherein impulse 
responses of subbanding filters are convoluted in said 
codebooks.



38. A voice coder comprising:

dividing means for dividing supplied discrete voice 
signals into frames of a first pre-set time length and further 
dividing said frames into subframes of a second pre-set time 
length smaller than said first pre-set time length; 

masking calculating means for calculating masking 
threshold values from said voice signals based on auditory 
sense masking characteristics; 

deciding means for deciding a number of multipulses for
each of said subframes based on said masking threshold values;
and

means for representing excitation signals of said voice 
signals in a form of multipulse using the number of 
multipulses decided for each of said subframes. 

39. The voice coder of claim 38, further comprising 
subbanding means for subbanding said voice signals, wherein 
said deciding means decides the number of multipulses for each 
subbanded signal. 

40. The voice coder of claim 38, further comprising an 
auditory sense weighting means for calculating filter 
coefficients based on said masking threshold values and 
conducting auditory sense weighting to said voice signals 
based on said filter coefficients. 

41. A voice coder comprising: 

dividing means for dividing supplied discrete voice 
signals into frames of a first pre-set time length; 

means for generating subframes by dividing said frames 
into divisions of a second pre-set time length; 

masking calculating means for calculating masking 
threshold values from said voice signals based on auditory 
sense masking characteristics; 

pitch calculating means for calculating pitch parameters 
so as to make signals regenerated based on said adaptive 
codebooks made of past excitation signals approximate, for 
each of said subframes, said voice signals; 

auditory sense weighting means for calculating filter
coefficients based on said masking threshold values and 
conducting auditory sense weighting to error signals between 
signals regenerated with said pitch calculating means and said 
voice signals based on said filter coefficients; 

deciding means for deciding a number of multipulses for 
each of said subframes based on said masking threshold values;
and 

means for calculating a multipulse minimizing said error 
signal power using the number of multipulses decided for each 
of said subframes and representing excitation signals of said
voice signals using said multipulse. 

42. A method of searching codebooks comprising the steps
of:

(a) dividing supplied discrete voice signals into frames 
of a first pre-set time length and further dividing said 
frames into subframes of a second pre-set time length;

(b) calculating masking threshold values from said voice 
signals based on auditory sense masking characteristics; 

(c) allocating a bit number of a codebook to each of
said subframes; and

(d) searching for a code vector for each of said 
subframes using a codebook having the allocated bit number. 

43. The method of searching codebooks of claim 42,
wherein said codebooks are excitation codebooks.

44. The method of searching codebooks of claim 42,
wherein said codebooks are gain codebooks.

45. The method of searching codebooks of claim 42, 
wherein the step (a) is a step of dividing and subbanding 
supplied discrete voice signals into frames of the first pre- 
set time length and further dividing said frames into 
subframes of the second pre-set time length, and the steps (b) 
to (d) are conducted in each band.

46. The method of searching codebooks of claim 45, 
wherein impulse responses of subbanding filters are convoluted 
in advance. 

47. A multipulse calculating method comprising the steps
of: 

(a) dividing and subbanding supplied discrete voice 
signals into frames of a first pre-set time length and further 
dividing said frames into subframes of a second pre-set time 
length; 

(b) calculating masking threshold values from said voice 
signals based on auditory sense masking characteristics, and 
dividing supplied discrete voice signals into frames of the 
first pre-set time length and further dividing said frames 
into subframes of the second pre-set time length; 

(c) deciding a number of multipulses for each of said 
subframes based on said masking threshold values; and 

(d) calculating a multipulse minimizing said error
signal power using a number of multipulses decided for each of
said subframes and representing excitation signals of said 
voice signals using said multipulse. 

48. The multipulse calculating method of claim 47,
wherein the step (a) is a step of dividing and subbanding 
supplied discrete voice signals into frames of the first pre- 
set time length and further dividing said frames into 
subframes of the second pre-set time length, and the steps (b) 
to (d) are conducted in each band. 
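
For illustration only, and not as part of the claims, the codebook search step recited above (selecting the code vector that minimizes a weighted error signal power) can be sketched as follows. This is a minimal sketch under stated assumptions: the target signal and the code vectors are taken to be already auditory sense weighted, the gain is chosen in the usual least-squares manner, and the function name search_codebook and its arguments are hypothetical. The derivation of the weighting filter coefficients from masking threshold values is not shown.

```python
def search_codebook(weighted_target, weighted_codebook):
    """Return (index, gain, error_power) of the code vector that
    minimizes the squared error between the weighted target and the
    gain-scaled code vector. All inputs are assumed to be already
    weighted; the weighting itself is outside this sketch."""
    target_energy = sum(x * x for x in weighted_target)
    best = None
    for index, vector in enumerate(weighted_codebook):
        correlation = sum(x * y for x, y in zip(weighted_target, vector))
        energy = sum(y * y for y in vector)
        if energy == 0.0:
            continue  # an all-zero vector cannot reduce the error
        gain = correlation / energy  # least-squares optimal gain
        error_power = target_energy - correlation * correlation / energy
        if best is None or error_power < best[2]:
            best = (index, gain, error_power)
    return best


# Hypothetical 3-sample subframe and 3-entry codebook.
target = [1.0, 2.0, 3.0]
codebook = [[1.0, 0.0, 0.0], [2.0, 4.0, 6.0], [0.0, 1.0, 0.0]]
result = search_codebook(target, codebook)
```

In this toy example the second code vector is a scaled copy of the target, so it is selected with a gain of 0.5 and zero residual error power. Switching between codebooks of different bit numbers would amount to passing a codebook of a different size for each subframe.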